This paper describes a novel method by which a spoken dialogue system can learn to choose an optimal dialogue strategy from its experience interacting with human users. The method is based on a combination of reinforcement learning and performance modeling of spoken dialogue systems. The reinforcement learning component applies Q-learning (Watkins, 1989), while the performance modeling component applies the PARADISE evaluation framework (Walker et al., 1997) to learn the performance function (reward) used in reinforcement learning. We illustrate the method with a spoken dialogue system named ELVIS (EmaiL Voice Interactive System), which supports access to email over the phone. We conduct a set of experiments for training an optimal dialogue strategy on a corpus of 219 dialogues in which human users interact with ELVIS over the phone. We then test that strategy on a corpus of 18 dialogues. We show that ELVIS can learn to optimize its strategy selection for agent initiative, for reading messages, and for summarizing email folders.
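The combination described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a tabular Q-function and a PARADISE-style reward computed as weighted task success minus weighted dialogue costs, with inputs already normalized. The function names, state and action labels, weights, and learning rate are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def performance(task_success, costs, alpha=0.5, weights=(0.3, 0.2)):
    """Hypothetical PARADISE-style reward: weighted success minus weighted costs.

    Inputs are assumed to be normalized already; alpha and weights are
    illustrative values, not coefficients reported in the paper.
    """
    return alpha * task_success - sum(w * c for w, c in zip(weights, costs))


def q_update(q, state, action, reward, next_state, next_actions, lr=0.1, gamma=1.0):
    """One Q-learning step: move Q(s, a) toward reward + gamma * max Q(s', a')."""
    # A terminal state has no next actions, so its future value is 0.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += lr * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Illustrative single-step dialogue: the strategy choice ends the dialogue,
# and the reward is the performance score of that dialogue.
q = defaultdict(float)
reward = performance(task_success=1.0, costs=(0.5, 0.25))
new_value = q_update(q, "start", "system_initiative", reward, "end", next_actions=[])
```

In this sketch the reward arrives only at the end of a dialogue, so the estimated value of a strategy choice is driven by the overall performance score of the dialogues in which it was used.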